
fix: Vulkan shader gen binary path when Cross-compiling #11096

Merged
merged 4 commits into ggerganov:master
Jan 8, 2025

Conversation

ag2s20150909
Contributor

Cross-compiling uses the host system's tools; hard-coded paths won't work.

@github-actions github-actions bot added the Vulkan (Issues specific to the Vulkan backend) and ggml (changes relating to the ggml tensor library for machine learning) labels on Jan 6, 2025
@0cc4m
Collaborator

0cc4m commented Jan 6, 2025

There's already an open PR for cross compiling fixes (#10448), which has been taking a while (@sparkleholic @bandoti). Are the changes related or is this a separate fix?

@ag2s20150909
Contributor Author

> There's already an open PR for cross compiling fixes (#10448), which has been taking a while (@sparkleholic @bandoti). Are the changes related or is this a separate fix?

This is a fix for #11037

@bandoti
Contributor

bandoti commented Jan 6, 2025

These are two separate issues. In the case of #11037, they seem to have a super-project which includes llama.cpp as a submodule from another CMakeLists.txt.

@sparkleholic has a fix for cross-compiling, for example building on x86_64 Linux with a build target of aarch64. In that case, it ensures the shaders are generated using the build system's architecture.

@0cc4m
Collaborator

0cc4m commented Jan 8, 2025

Thank you for the fix!

@0cc4m 0cc4m merged commit bec2183 into ggerganov:master Jan 8, 2025
48 checks passed
@giladgd
Contributor

giladgd commented Jan 8, 2025

This PR reintroduced the issue that was fixed in #11037 when compiling with Vulkan enabled on Windows x64 using LLVM.

@ag2s20150909 On which setup have you encountered an issue that this PR solves? I'd like to try to reproduce it so we can find a fix that works for both cases.

@0cc4m
Collaborator

0cc4m commented Jan 8, 2025

> This PR reintroduced the issue that was fixed in #11037 when compiling with Vulkan enabled on Windows x64 using LLVM.

Damn, sorry @giladgd . I should have checked with you before merging. The change didn't look problematic to me. But does that mean CMAKE_CROSSCOMPILING is true on your system?

@giladgd
Contributor

giladgd commented Jan 8, 2025

@0cc4m I don't cross-compile, but since I use a toolchain file that does set(target x86_64-pc-windows-msvc), that's probably what causes CMAKE_CROSSCOMPILING to be true.
The compilation now fails when I compile from Windows x64 to Windows x64 using LLVM with a custom toolchain for this, but it also fails when compiling from Windows arm64 to Windows arm64 using LLVM with llama.cpp's toolchain file.

It's a bit hacky, but I think a fix for this can be to test whether there's a file at the given path, and add the $<TARGET_FILE_DIR:vulkan-shaders-gen>/ part conditionally if it doesn't exist.
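A rough sketch of that idea in CMake (hypothetical variable names; the actual target and variable names in ggml's CMakeLists may differ):

```cmake
# Hypothetical sketch: fall back to the target-file path only when no
# usable binary already exists at the configured location.
if(EXISTS ${VULKAN_SHADER_GEN_BIN})
    set(_vulkan_shader_gen_cmd ${VULKAN_SHADER_GEN_BIN})
else()
    # Resolve relative to the vulkan-shaders-gen build output directory.
    set(_vulkan_shader_gen_cmd $<TARGET_FILE_DIR:vulkan-shaders-gen>/vulkan-shaders-gen)
endif()
```

One caveat with this approach: `if(EXISTS ...)` runs at configure time, while `$<TARGET_FILE_DIR:...>` is a generator expression resolved at generate/build time, which is part of what makes it hacky.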

@bandoti
Contributor

bandoti commented Jan 8, 2025

Hmm. This sounds like you need the cross-compile fixes in #10448 when compiling from Windows to arm64 (definitely). In the other case, you are effectively cross-compiling as well, as you said, except from the same architecture to itself.

I think the way to fix this is, when NOT CMAKE_CROSSCOMPILING, to add another check for whether the target architecture is the same as the host architecture. The terminology can be a bit confusing. Here's the general idea:

```cmake
if(CMAKE_CROSSCOMPILING AND CMAKE_SYSTEM_PROCESSOR STREQUAL CMAKE_HOST_SYSTEM_PROCESSOR)
    message(STATUS "Cross-compiling is enabled, but the target system is the same as the build system.")
else()
    if(CMAKE_CROSSCOMPILING)
        message(STATUS "Cross-compiling is enabled, and the target system is different from the build system.")
    else()
        message(STATUS "Cross-compiling is not enabled.")
    endif()
endif()
```

@0cc4m
Collaborator

0cc4m commented Jan 8, 2025

> @0cc4m I don't cross-compile, but since I use a toolchain file that does set(target x86_64-pc-windows-msvc), it's probably what causes CMAKE_CROSSCOMPILING to be true.

Would the original check (CMAKE_SYSTEM_NAME STREQUAL CMAKE_HOST_SYSTEM_NAME instead of NOT CMAKE_CROSSCOMPILING) have worked in your case?
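In other words, the pre-PR condition compared OS names rather than the cross-compiling flag. A hypothetical sketch (the surrounding ggml CMake logic is simplified away):

```cmake
# Sketch: gate reuse of the freshly built generator on matching OS names
# instead of CMAKE_CROSSCOMPILING, so a same-OS toolchain file that sets
# CMAKE_CROSSCOMPILING still takes the direct path.
if(CMAKE_SYSTEM_NAME STREQUAL CMAKE_HOST_SYSTEM_NAME)
    message(STATUS "Same OS as host: run the built vulkan-shaders-gen directly.")
else()
    message(STATUS "Different OS: locate a host-runnable shader generator instead.")
endif()
```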

@bandoti
Contributor

bandoti commented Jan 8, 2025

@giladgd If you would like to test against my branch with the latest fixes from @sparkleholic (https://github.com/bandoti/llama.cpp/tree/sparkle_master_fix), I have it updated to latest with the fixes. We're just waiting to hear back from the original author to get it merged in. I am curious whether this would fix the problem for you.

@giladgd
Contributor

giladgd commented Jan 8, 2025

> Would the original check (CMAKE_SYSTEM_NAME STREQUAL CMAKE_HOST_SYSTEM_NAME instead of NOT CMAKE_CROSSCOMPILING) have worked in your case?

@0cc4m It seems to work well both on Windows x64 and Windows arm64.

> @giladgd If you would like to test against my branch with the latest fixes from @sparkleholic here: https://github.com/bandoti/llama.cpp/tree/sparkle_master_fix, I have it updated to latest with the fixes. We're just waiting to hear back from the original author to get it merged in. I am curious if this would fix the problem for you.

@bandoti Your branch fails to compile on my Windows x64 machine with LLVM, with this error:

```
CMake Warning at llama.cpp/ggml/src/ggml-vulkan/CMakeLists.txt:18 (message):
  Neither MSVC nor clang found
Call Stack (most recent call first):
  llama.cpp/ggml/src/ggml-vulkan/CMakeLists.txt:116 (detect_host_compiler)

-- Configuring incomplete, errors occurred!
CMake Error at llama.cpp/ggml/src/ggml-vulkan/CMakeLists.txt:118 (message):
  Host compiler not found
```

I can give you setup instructions on how to replicate this on a Windows x64 machine with my custom LLVM toolchain so you can test it yourself; let me know if you're interested.

@bandoti
Contributor

bandoti commented Jan 8, 2025

@giladgd Yes that would be helpful! I will test it and see what happens. Thanks.

@giladgd
Contributor

giladgd commented Jan 8, 2025

@bandoti In node-llama-cpp there's a CLI command that builds llama.cpp from source; by default it detects the backend that would work best on your machine (it prioritizes CUDA, then Vulkan), but you can override it to build a specific backend.
I also use this command in the CI to build the prebuilt binaries that ship with each version of the library.

To replicate the Windows 10 x64 setup I tested on, make sure you have Node.js v22 installed, all the Visual Studio build tools components as detailed here, and also the Vulkan dependencies described here.

Then run these commands:

```shell
mkdir repro
cd repro
npm init -y

npm install [email protected]
npx --no node-llama-cpp source download --gpu vulkan

cd node_modules/node-llama-cpp/llama
rmdir llama.cpp
git clone https://github.com/bandoti/llama.cpp.git
cd llama.cpp
git checkout sparkle_master_fix
cd ../../../..

npx --no node-llama-cpp source build --gpu vulkan
```

The last command is the one that failed to build on my Windows 10 x64 machine.

It attempts to use LLVM automatically if it detects that it's installed (either as part of the Visual Studio build tools or separately), and if it doesn't find it, it falls back to using MSVC.
If the LLVM build fails, it then automatically retries with MSVC, so only regard the first try of that command.
(On the CI of node-llama-cpp, it doesn't retry with MSVC, which is how I re-encountered this issue now.)

@giladgd
Contributor

giladgd commented Jan 8, 2025

@0cc4m I've opened #11149 with the fix you suggested.

@bandoti
Contributor

bandoti commented Jan 9, 2025

@giladgd I forgot to mention that there's a way to specify a separate toolchain for vulkan-shaders-gen, which bypasses searching the system path.

Please try passing this to cmake using my sparkle_master_fix branch.

```shell
-DGGML_SHADERS_GEN_TOOLCHAIN=/path/to/build-system-toolchain.cmake
```

This toolchain will then be used to build the runnable app on the build system.
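A minimal build-system toolchain file for this purpose might look like the following (a hypothetical example for an x64 Windows host; the compiler names and system values are assumptions to adjust for your machine):

```cmake
# build-system-toolchain.cmake (hypothetical example)
# Describes the *build* (host) machine, so that vulkan-shaders-gen is
# compiled as a binary runnable during the build itself.
set(CMAKE_SYSTEM_NAME Windows)
set(CMAKE_SYSTEM_PROCESSOR AMD64)
set(CMAKE_C_COMPILER clang)
set(CMAKE_CXX_COMPILER clang++)
```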

@bandoti
Contributor

bandoti commented Jan 9, 2025

That being said, the issue may be that the find_program commands need NO_CMAKE_FIND_ROOT_PATH to prevent them from searching the wrong directory hierarchy.
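For illustration (a hypothetical sketch, not the exact ggml code): when a cross toolchain sets CMAKE_FIND_ROOT_PATH and restricts program lookup to the target sysroot, find_program can miss host compilers entirely; NO_CMAKE_FIND_ROOT_PATH lets a single call ignore that restriction and search the real host PATH.

```cmake
# Sketch: look for a host-runnable clang even when a cross toolchain has
# confined find_program to the target sysroot via CMAKE_FIND_ROOT_PATH.
find_program(HOST_CXX_COMPILER
    NAMES clang++ clang
    NO_CMAKE_FIND_ROOT_PATH  # search the host PATH, not the target sysroot
)
```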
